1. PROBLEM STATEMENT


1.1.  Overview 


One  of  the  few  common  threads  throughout  most  distributed  database  research  projects  has  been  the 
assumption  that  the  network  topology  is  stable.  However,  mobility  and  portability  are  often  essential  features 
of  network  nodes,  especially  (but  not  exclusively)  in  military  applications.  This  project  investigated  both 
basic and applied research problems which arise when the presupposition of fixed topology is removed. The
goal of this effort was to explore the essential issues of dynamically reconfigurable database systems and to
develop a design strategy for such systems.


1.2.  General  System  Characteristics 

This  study  was  targeted  toward  networks  with  the  following  characteristics: 

(1)  The  network  consists  of  two  types  of  nodes  -  servers  and  clients.  The  servers  offer  specialized 
functionality  which  may  be  accessed  by  any  authorized  node. 

(2)  Access  to  a  server  may  be  initiated  by  any  node  (client  or  another  server).  Access  to  a  client  may,  in 
general,  be  initiated  only  by  a  server. 

(3)  The  database  management  facilities  will  be  provided  by  a  database  server.  Other  servers  will  provide 
functions  such  as  naming,  authentication,  and  network  monitoring. 

(4)  Any  node  may  disappear  from  the  network  (become  unreachable)  at  any  time  with  the  expectation  that  it 
will  reappear  after  an  unspecified  period  of  time,  possibly  at  a  new  physical  location. 


(5) A link between any two nodes is transient even while both nodes are connected to the network and lasts
only for the duration of a session (initiated by one of the nodes).


(6) The two nodes engaged in a session may or may not be reachable by other nodes during the session,
depending on their level of sophistication (e.g., number of dial-in ports).


(7) The majority of network entities are highly mobile client nodes which occasionally appear, interact with
the network, and then depart.

(8) Server nodes are also mobile; however, they will tend to be more stable than the client nodes.

(9) Though each server is a separate and unique logical entity, each may be implemented on one or more
physical nodes, and a single node may offer multiple services.

At the highest level, the network consists of a collection of mobile nodes which communicate on
dynamically established links. The database and other major network facilities are contained in a group of
server nodes which, although mobile, have a high degree of availability. As a consequence, server
functionality will be replicated. Client nodes appear at different remote locations, attach to a server for the
purposes of either adding information to the database or querying its contents, and then depart. In some
situations, clients may pose a query, depart, and then return at another location to request the result. The
remote client nodes may exchange messages through a post office, but there is no direct client communication
during a session.

1.3. Objectives - Scope and Issues

The dynamic nature of a system as outlined above imposes special constraints on the designer.
Solutions and methods that apply to distributed database systems built on top of static or quasi-static
communication networks do not, in general, apply to a system built on top of a highly dynamic network. It
was therefore necessary to investigate the set of research issues listed below to determine the impact of a
dynamic topology:
data allocation,
query optimization,
concurrency control,
survivability,
network management,
communication protocols,
simulation methodology.




2. RESULTS

2.1. Data allocation

A number of dynamic data allocation strategies for dynamically reconfigurable distributed databases
have been developed. The primary tenet underlying the data allocation strategy is that reallocation should
take place only when overall system performance would be improved or if the number of copies of a file falls
below a pre-specified minimum. The allocation algorithms choose the best possible assignment using
heuristic benefit functions and greedy search strategies. The goal of the technique is to select an assignment
as close as possible to optimal while minimizing search time. Performance experiments indicated that the
allocations chosen by these algorithms were on the average within 2% of the optimal assignment while using
orders of magnitude less CPU time. These results are reported in detail in [CP-4]. The overall
performance of the algorithms was then enhanced by the development of techniques for the parallel
reorganization of the database when reallocation is necessary; see [CP-2].
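
The greedy selection described above can be illustrated with a minimal sketch. It is not the algorithm reported in the cited papers: the benefit function, the cost model, and every name used here (greedy_allocate, read_rate, dist, storage_cost, min_copies) are hypothetical, chosen only to show how a greedy search might keep adding copies of a file while the marginal benefit is positive or the number of copies is below the required minimum.

```python
def greedy_allocate(nodes, read_rate, dist, storage_cost, min_copies):
    """Choose the nodes that hold copies of one file by greedy search.

    read_rate[n]: reads issued by node n per unit time (assumed known)
    dist[a][b]:   cost of one read issued by node a and served by node b
    storage_cost: cost of keeping one copy per unit time
    min_copies:   pre-specified minimum number of copies (assumed >= 1)
    """
    def access_cost(copies):
        # Every node reads from its cheapest copy.
        return sum(read_rate[n] * min(dist[n][c] for c in copies) for n in nodes)

    copies = []
    while len(copies) < len(nodes):
        best, best_benefit = None, None
        for cand in nodes:
            if cand in copies:
                continue
            if copies:
                saved = access_cost(copies) - access_cost(copies + [cand])
            else:
                # First copy: prefer the placement with the lowest access cost.
                saved = -access_cost([cand])
            benefit = saved - storage_cost  # hypothetical heuristic benefit function
            if best_benefit is None or benefit > best_benefit:
                best, best_benefit = cand, benefit
        # Stop once the minimum is met and no candidate improves overall cost.
        if len(copies) >= min_copies and best_benefit <= 0:
            break
        copies.append(best)
    return copies
```

A greedy search of this kind evaluates only a number of candidate placements that is polynomial in the number of nodes, rather than every subset of nodes, which is the kind of saving in search time the text refers to.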


2.2.  Query  optimization 

Two query optimization techniques developed for static distributed database systems were modified to
consider the problems of a dynamic environment. The strategies were the dynamic execution semijoin
approach of Yu and Chang and the bucket semijoin technique of Krishnamurthy and Morgan. Experimental
results indicated that while the dynamic nature of the network impacted the performance of both strategies,
the dynamic execution approach had a smaller increase in running time. Details of this study appear in [TR ...].
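
Both strategies build on the semijoin, which reduces the data shipped between sites before a join. The following minimal sketch shows only that basic operation between two sites; it implements neither the Yu and Chang nor the Krishnamurthy and Morgan technique, and the function name, the row layout, and the shipped-item count are illustrative assumptions.

```python
def semijoin_reduce(r_rows, s_rows, key):
    """Join relation R (held at site 1) with S (held at site 2) via a semijoin.

    Rows are dictionaries. Returns the join result and the number of
    items (key values plus rows) that would cross the network.
    """
    # Step 1: site 1 projects R on the join attribute and ships the distinct values.
    r_keys = {row[key] for row in r_rows}
    # Step 2: site 2 keeps only the S rows that can possibly join.
    s_reduced = [row for row in s_rows if row[key] in r_keys]
    # Step 3: site 2 ships the reduced S to site 1, which completes the join.
    joined = [{**r, **s} for r in r_rows for s in s_reduced if r[key] == s[key]]
    shipped = len(r_keys) + len(s_reduced)
    return joined, shipped
```

In a dynamic network the site holding S may depart between steps 1 and 3, which is one reason strategies that can revise their plan during execution may be less sensitive to topology changes.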


2.3.  Concurrency  control 

The methodology developed for concurrency control in a dynamically reconfigurable environment is the
Enhanced Multiversion Timestamp (EMT) approach [T-5], which is an extension of Reed's timestamping
algorithm. EMT contains a special VIEW operation which permits reading of data without regard for [illegible],
handles read-only transactions [illegible], and attempts to avoid redoing write operations [illegible].
Performance experiments indicate that the above enhancements improve both the average and worst-case
behavior of transactions operating under this concurrency control approach as compared with basic
timestamping.
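
For orientation, the basic multiversion timestamping scheme that serves as the baseline can be sketched as follows. This is a simplified illustration of the general technique, not the enhanced approach described above: the VIEW operation and the other enhancements are not modelled, and the class name, the version layout, and the use of positive integer timestamps are assumptions.

```python
class MultiversionObject:
    """One data object under basic multiversion timestamp ordering."""

    def __init__(self, initial_value):
        # Each version is [write_ts, max_read_ts, value].
        self.versions = [[0, 0, initial_value]]

    def _visible(self, ts):
        # Latest version written at or before ts.
        return max((v for v in self.versions if v[0] <= ts), key=lambda v: v[0])

    def read(self, ts):
        # Reads are never rejected: they are served from an older version.
        version = self._visible(ts)
        version[1] = max(version[1], ts)  # remember the youngest reader
        return version[2]

    def write(self, ts, value):
        version = self._visible(ts)
        if version[1] > ts:
            # A younger transaction has already read the version this
            # write would supersede, so the writer must abort and restart.
            return False
        if version[0] == ts:
            version[2] = value  # same transaction overwrites its own version
        else:
            self.versions.append([ts, ts, value])
        return True
```

The rejected write in this scheme is the case in which work must be redone by the aborted transaction.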

2.4.  Survivability 

The database management system must be able to withstand the departure of any number of client nodes
and any single server node. Initial work in this area was predicated on the assumption that the underlying
communication media would preclude network partitioning. This assumption is valid in many networks such
as common carrier phone links. In this context, algorithms for reliable transaction management and commit
were developed to handle both planned and unexpected departures of server nodes [CP-7]. The transaction
management algorithm can tolerate up to k simultaneous failures, provided that k is less than the total
number of database server nodes. The commit protocol requires only the existence of one server for each
data object in the read or write set at the appropriate time in the protocol.
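
The availability condition stated for the commit protocol can be expressed as a simple check. The sketch below is only an illustration of that condition, not the protocol itself; the function name and the representation of replica locations and reachable servers as sets are assumptions.

```python
def can_commit(read_set, write_set, replicas, reachable):
    """Return True if every object touched by the transaction has a reachable server.

    replicas[obj]: set of servers holding a copy of obj (hypothetical layout)
    reachable:     set of servers currently attached to the network
    """
    needed = set(read_set) | set(write_set)
    # One surviving server per data object is sufficient.
    return all(replicas.get(obj, set()) & reachable for obj in needed)
```

With two copies of each object spread over three servers, for example, the check still succeeds after any single server departs.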

The problem of partitioning was treated by a different set of algorithms [CP-1, CP-3] which made an
initial assumption of a hierarchical topology. The algorithms take an optimistic approach to the
partitioning problem, allowing work in individual partitions to proceed as far as possible and then reconciling
differences at reconnection. The partitioning algorithms are unique in that they permit nodes to reattach at
points other than their point of departure, thus supporting complete reconfiguration. The placement of
network monitors is a key element of the partition handling strategy. An algorithm has been developed for
the optimal placement of network monitors in a hierarchical network.


2.5.  Network  Management 

The responsibility for the management of the network resides with the Network Monitor and Name
Servers. The design and implementation of the network management facility is reported in [TR-5]. These
servers provide for both planned and unexpected departures from and returns to the network.

2.6. Communication Protocols

[...] communication. Details of these low level protocols are provided in [T-4].

In the communication model followed in this project, a session layer mediates between the low-level
communication links and the high-level applications. The protocols developed [T-1] enable processes to
return to and depart from the network, to establish sessions between any two processes, and to send and
receive messages.

2.7.  Routing 

The highly dynamic nature of the network considered necessitated the development of a new approach
to routing. The strategy followed involved the enhancement of Floyd's basic path algorithm with alternate
routing procedures [T-2] which seek new subpaths when a topological change affects a given route.
Simulations indicate that this approach produces substantial performance benefits in a network with a
dynamic topology.
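
The idea of seeking an alternate route after a topological change can be sketched on top of Floyd's all-pairs shortest path algorithm. The alternate routing procedures of the cited report are not reproduced here: as a stand-in, this sketch simply removes the failed link and recomputes all paths, and the function names and the link representation are assumptions.

```python
INF = float("inf")

def floyd(nodes, links):
    """All-pairs shortest paths; links maps (a, b) to the cost of an undirected link."""
    dist = {a: {b: (0 if a == b else INF) for b in nodes} for a in nodes}
    nxt = {a: {b: None for b in nodes} for a in nodes}
    for (a, b), cost in links.items():
        dist[a][b] = dist[b][a] = cost
        nxt[a][b], nxt[b][a] = b, a
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]  # first hop toward j goes via k
    return dist, nxt

def route(nxt, src, dst):
    """Follow next-hop entries from src to dst; returns None if dst is unreachable."""
    if src == dst:
        return [src]
    if nxt[src][dst] is None:
        return None
    path = [src]
    while src != dst:
        src = nxt[src][dst]
        path.append(src)
    return path

def reroute_after_failure(nodes, links, failed, src, dst):
    """Drop the failed link and look for an alternate route (naive full recomputation)."""
    remaining = {l: c for l, c in links.items() if set(l) != set(failed)}
    _, nxt = floyd(nodes, remaining)
    return route(nxt, src, dst)
```

A full recomputation costs time cubic in the number of nodes for every change, which suggests why procedures that repair only the affected subpaths would be attractive in a highly dynamic network.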

2.8.  Simulation  Methodology 

Simulation has proven to be a valuable tool in the projection of the performance of many of the new
algorithms developed during this project [CP-6]. During the course of these simulation experiments, a
number of shortcomings of existing simulation techniques were noted. Consequently, a simulation package
oriented toward the simulation of distributed systems has been developed [reference illegible]. Among the
special features of this package are portability, an interface between its special-purpose simulation language
and C, a direct interface to the operating system, extensibility, special constructs to support simulation of
distributed systems, generation of multiple plots, an interactive debugger, and control of the statistics produced.